Asymptotic distribution and sparsistency for ℓ1-penalized parametric M-estimators, with applications to linear SVM and logistic regression
Authors
Abstract
Since its early use in least squares regression problems, the ℓ1-penalization framework for variable selection has been employed in conjunction with a wide range of loss functions encompassing regression, classification and survival analysis. While a well-developed theory exists for ℓ1-penalized least squares estimates, few results concern the behavior of ℓ1-penalized estimates for general loss functions. In this paper, we derive two results concerning penalized estimates for a wide array of penalty and loss functions. Our first result characterizes the asymptotic distribution of penalized parametric M-estimators under mild conditions on the loss and penalty functions in the classical setting (fixed p, large n). Our second result gives explicit necessary and sufficient generalized irrepresentability (GI) conditions for ℓ1-penalized parametric M-estimates to consistently select the components of a model (sparsistency) as well as their signs (sign consistency). In general, the GI conditions depend on the Hessian of the risk function at the true value of the unknown parameter. Under Gaussian predictors, we obtain a set of conditions under which the GI conditions can be re-expressed solely in terms of the second moment of the predictors. We apply our theory to contrast ℓ1-penalized SVM and logistic regression classifiers and find conditions under which they behave identically in terms of model selection consistency (sparsistency and sign consistency). Finally, we provide simulation evidence for the theory based on these classification examples.
Author affiliations: ∗Indiana University; †Renmin University of China; ‡University of California, Berkeley
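As a rough illustration of how a Hessian-based irrepresentability check might look in practice, the sketch below evaluates a quantity modeled on the classical least squares irrepresentable condition, with the risk Hessian H standing in for the predictor covariance. The exact form of the GI conditions is given in the paper itself; the expression max over inactive j of |H[j, A] H[A, A]^{-1} sign(β_A)|, the function names `irrepresentability_gap` and `logistic_hessian`, and the example covariance are all assumptions made for illustration only.

```python
import numpy as np


def irrepresentability_gap(H, beta):
    """Largest |H[j, A] @ inv(H[A, A]) @ sign(beta[A])| over inactive j.

    Assumed form, patterned on the least squares irrepresentable condition
    with the risk Hessian H replacing the covariance. Values below 1 suggest
    the (assumed) condition holds; values above 1 suggest it fails.
    """
    H = np.asarray(H, dtype=float)
    beta = np.asarray(beta, dtype=float)
    active = beta != 0
    if not active.any() or active.all():
        return 0.0  # nothing to compare when A or its complement is empty
    H_AA = H[np.ix_(active, active)]
    H_NA = H[np.ix_(~active, active)]
    v = H_NA @ np.linalg.solve(H_AA, np.sign(beta[active]))
    return float(np.max(np.abs(v)))


def logistic_hessian(X, beta):
    """Empirical Hessian of the logistic risk at beta: mean of p(1-p) x x^T."""
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    w = p * (1.0 - p)
    return (X * w[:, None]).T @ X / X.shape[0]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical setup: Gaussian predictors, third coordinate inactive.
    cov = np.array([[1.0, 0.0, 0.3],
                    [0.0, 1.0, 0.3],
                    [0.3, 0.3, 1.0]])
    beta_true = np.array([1.0, 1.0, 0.0])
    X = rng.multivariate_normal(np.zeros(3), cov, size=20000)
    H_hat = logistic_hessian(X, beta_true)
    print("gap from predictor second moment:", irrepresentability_gap(cov, beta_true))
    print("gap from logistic risk Hessian:  ", irrepresentability_gap(H_hat, beta_true))
```

Comparing the two printed values gives a feel for the abstract's point that, in general, the condition depends on the Hessian of the risk rather than on the second moment of the predictors alone.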
Similar resources
Positive-Shrinkage and Pretest Estimation in Multiple Regression: A Monte Carlo Study with Applications
Consider the problem of predicting a response variable using a set of covariates in a linear regression model. If it is a priori known or suspected that a subset of the covariates does not significantly contribute to the overall fit of the model, a restricted model that excludes these covariates may be sufficient. If, on the other hand, the subset provides useful information, shrinkage meth...
Penalized Likelihood-type Estimators for Generalized Nonparametric Regression
We consider the asymptotic analysis of penalized likelihood type estimators for generalized non-parametric regression problems in which the target parameter is a vector valued function defined in terms of the conditional distribution of a response given a set of covariates. A variety of examples including ones related to generalized linear models and robust smoothing are covered by the theory. ...
Generalized Nonparametric Regression via Penalized Likelihood
We consider the asymptotic analysis of penalized likelihood type estimators for generalized non-parametric regression problems in which the target parameter is a vector valued function defined in terms of the conditional distribution of a response given a set of covariates. A variety of examples including ones related to generalized linear models and robust smoothing are covered by the theory. ...
Asymptotic Normality in Generalized Linear Mixed Models
Title of dissertation: Asymptotic Normality in Generalized Linear Mixed Models. Min Min, Doctor of Philosophy, 2007. Dissertation directed by: Professor Paul J. Smith, Statistics Program, Department of Mathematics. Generalized Linear Mixed Models (GLMMs) extend the framework of Generalized Linear Models (GLMs) by including random effects into the linear predictor. This will achieve two main goals of ...
Publication date: 2009